Size Bounds for Conjunctive Queries with General Functional Dependencies

Authors

  • Gregory Valiant
  • Paul Valiant
Abstract

This paper extends the work of Gottlob, Lee, and Valiant (PODS 2009) [9], and considers worst-case bounds for the size of the result Q(D) of a conjunctive query Q to a database D given an arbitrary set of functional dependencies. The bounds in [9] are based on a “coloring” of the query variables. In order to extend the previous bounds to the setting of arbitrary functional dependencies, we leverage tools from information theory to formalize the original intuition that each color used represents some possible entropy of that variable, and bound the maximum possible size increase via a linear program that seeks to maximize how much more entropy is in the result of the query than the input. This new view allows us to precisely characterize the entropy structure of worst-case instances for conjunctive queries with simple functional dependencies (keys), providing new insights into the results of [9]. We extend these results to the case of general functional dependencies, providing upper and lower bounds on the worst-case size increase. We identify the fundamental connection between the gap in these bounds and a central open question in information theory. Finally, we show that, while both the upper and lower bounds are given by exponentially large linear programs, one can distinguish in polynomial time whether the result of a query with an arbitrary set of functional dependencies can be any larger than the input database.
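
The entropy view mentioned in the abstract can be illustrated on a toy instance. The sketch below is not the paper's linear program or coloring construction; it only checks three standard facts on a hypothetical database (the relations `R` and `S`, the query `Q`, and the helper `entropy` are all assumptions made for illustration). Under the uniform distribution over the result tuples, the joint entropy equals log2 |Q(D)|, the entropy of the variables of each atom is at most the log of that relation's size, and a functional dependency B -> C forces H(B, C) = H(B).

```python
import math
from collections import Counter


def entropy(tuples, positions):
    """Shannon entropy (in bits) of the projection onto `positions`,
    under the uniform distribution over `tuples`."""
    counts = Counter(tuple(t[i] for i in positions) for t in tuples)
    n = len(tuples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical toy database: R(A, B) and S(B, C), where B is a key of S,
# i.e. the functional dependency B -> C holds.
R = {(a, b) for a in range(4) for b in range(2)}  # 8 tuples
S = {(b, b + 10) for b in range(2)}               # 2 tuples

# Conjunctive query Q(A, B, C) :- R(A, B), S(B, C), evaluated naively.
Q = [(a, b, c) for (a, b) in R for (b2, c) in S if b == b2]

h_abc = entropy(Q, (0, 1, 2))  # joint entropy of all query variables
h_ab = entropy(Q, (0, 1))      # entropy of the variables of atom R
h_bc = entropy(Q, (1, 2))      # entropy of the variables of atom S
h_b = entropy(Q, (1,))         # entropy of B alone

# The joint entropy of the uniform distribution is log2 of the result size.
assert math.isclose(h_abc, math.log2(len(Q)))
# Each atom's variables carry at most log2 of that relation's size.
assert h_ab <= math.log2(len(R)) + 1e-9
assert h_bc <= math.log2(len(S)) + 1e-9
# The dependency B -> C means C adds no entropy once B is known.
assert math.isclose(h_bc, h_b)
```

In this instance the key on `S` keeps the result no larger than `R`, which is the kind of "no size increase" situation the abstract's final sentence refers to; the example does not show how the paper decides this in general.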

Similar resources

Entropy Bounds for Conjunctive Queries with Functional Dependencies

We study the problem of finding the worst-case size of the result Q(D) of a fixed conjunctive query Q applied to a database D satisfying given functional dependencies. We provide a characterization of this bound in terms of entropy vectors, and in terms of finite groups. In particular, we show that an upper bound provided by Gottlob, Lee, Valiant and Valiant [9] is tight, and that a corresponde...

Sensitivity of Counting Queries

In the context of statistical databases, the release of accurate statistical information about the collected data often puts at risk the privacy of the individual contributors. The goal of differential privacy is to maximise the utility of a query while protecting the individual records in the database. A natural way to achieve differential privacy is to add statistical noise to the result of t...

Discovery and Application of Functional Dependencies in Conjunctive Query Mining

We present an algorithm for mining frequent queries in arbitrary relational databases, over which functional dependencies are assumed. Building upon previous results, we restrict to the simple, but appealing subclass of simple conjunctive queries. The proposed algorithm makes use of the functional dependencies of the database to optimise the generation of queries and prune redundant queries. Fu...

Computing Supports of Conjunctive Queries on Relational Tables with Functional Dependencies

The problem of mining all frequent queries on a relational table is a problem known to be intractable even for conjunctive queries. In this article, we restrict our attention to conjunctive projection-selection queries and we assume that the table to be mined satisfies a set of functional dependencies. Under these assumptions, we define and characterize two pre-orderings with respect to which t...

Comparing and Mining Conjunctive Queries from a Relational Table with Functional Dependencies

In this paper we study the problem of mining all frequent queries in a relational table, a problem known to be intractable even for conjunctive queries. We restrict our attention to projection-selection queries and we assume that the table to be mined satisfies a set of functional dependencies. Under these assumptions, we define two pre-orderings with respect to which the support measure is show...

Journal:
  • CoRR

Volume abs/0909.2030

Publication date 2009